380 research outputs found

Controller Synthesis for Autonomous Systems Interacting With Human Operators

Author: Cooke N. J.
Cummings M. L.
Donath D.
Humphrey L.
Kwiatkowska M.
Puterman M.
Publication venue: ScholarlyCommons
Publication date: 01/04/2015

We propose an approach to synthesize control protocols for autonomous systems that account for uncertainties and imperfections in interactions with human operators. As an illustrative example, we consider a scenario involving road network surveillance by an unmanned aerial vehicle (UAV) that is controlled remotely by a human operator but also has a certain degree of autonomy. Depending on the type (i.e., probabilistic and/or nondeterministic) of knowledge about the uncertainties and imperfections in the operator-autonomy interactions, we use abstractions based on Markov decision processes and extend these models to stochastic two-player games. Our approach enables the synthesis of operator-dependent optimal mission plans for the UAV, highlighting the effects of operator characteristics (e.g., workload, proficiency, and fatigue) on UAV mission performance; it can also provide informative feedback (e.g., Pareto curves showing the trade-offs between multiple mission objectives), potentially assisting the operator in decision-making.

CiteSeerX

Crossref

ScholarlyCommons@Penn

"How May I Help You?": Modeling Twitter Customer Service Conversations Using Fine-Grained Dialogue Acts

Author: Austin J. L.
Bird S.
Bunt H.
Core M. G.
Gasic M.
Kim S. N.
Klüwer T.
Lafferty J. D.
Mohammad S. M.
Puterman M. L.
Sacks H.
Searle J. R.
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 15/09/2017

Given the increasing popularity of customer service dialogue on Twitter, analysis of conversation data is essential to understand trends in customer and agent behavior for the purpose of automating customer service interactions. In this work, we develop a novel taxonomy of fine-grained "dialogue acts" frequently observed in customer service, showcasing acts that are more suited to the domain than the more generic existing taxonomies. Using a sequential SVM-HMM model, we model conversation flow, predicting the dialogue act of a given turn in real time. We characterize differences between customer and agent behavior in Twitter customer service conversations, and investigate the effect of testing our system on different customer service industries. Finally, we use a data-driven approach to predict important conversation outcomes: customer satisfaction, customer frustration, and overall problem resolution. We show that the type and location of certain dialogue acts in a conversation have a significant effect on the probability of desirable and undesirable outcomes, and present actionable rules based on our findings. The patterns and rules we derive can be used as guidelines for outcome-driven automated customer service platforms. Comment: 13 pages, 6 figures, IUI 201

arXiv.org e-Print Archive

Crossref

Synchronization and Control in Intrinsic and Designed Computation: An Information-Theoretic Analysis of Competing Models of Stochastic Computation

Author: Christopher J. Ellison
Cover T. M.
Elliott R. J.
Hopcroft J. E.
James P. Crutchfield
John R. Mahoney
Klamka J.
Puterman M. L.
Ryan G. James
Strogatz S.
Publication venue: 'AIP Publishing'
Publication date: 29/07/2010

We adapt tools from information theory to analyze how an observer comes to synchronize with the hidden states of a finitary, stationary stochastic process. We show that synchronization is determined both by the process's internal organization and by an observer's model of it. We analyze these components using the convergence of state-block and block-state entropies, comparing them to the previously known convergence properties of the Shannon block entropy. Along the way, we introduce a hierarchy of information quantifiers as derivatives and integrals of these entropies, which parallels a similar hierarchy introduced for block entropy. We also draw out the duality between synchronization properties and a process's controllability. The tools lead to a new classification of a process's alternative representations in terms of minimality, synchronizability, and unifilarity. Comment: 25 pages, 13 figures, 1 table
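The Shannon block entropy mentioned in this abstract can be illustrated with a minimal sketch. The two-state process below (often called the Golden Mean process), its transition table, and all function names are assumptions chosen for illustration; they are not taken from the paper. The sketch computes H(L) exactly by enumerating words and then takes the discrete derivative H(L) - H(L-1), which should settle at the entropy rate.

```python
from itertools import product
from math import log2

# Hypothetical unifilar hidden Markov model (illustrative, not from the paper):
# TRANSITIONS[state][symbol] = (probability, next_state)
TRANSITIONS = {
    "A": {0: (0.5, "A"), 1: (0.5, "B")},
    "B": {0: (1.0, "A")},
}
# Stationary distribution over hidden states for the model above.
STATIONARY = {"A": 2 / 3, "B": 1 / 3}


def word_probability(word):
    """Probability of observing `word`, averaged over stationary start states."""
    total = 0.0
    for start, weight in STATIONARY.items():
        prob, state = weight, start
        for symbol in word:
            if symbol not in TRANSITIONS[state]:
                prob = 0.0  # this symbol cannot be emitted from this state
                break
            step, state = TRANSITIONS[state][symbol]
            prob *= step
        total += prob
    return total


def block_entropy(length):
    """Shannon entropy (in bits) of the distribution over words of `length`."""
    entropy = 0.0
    for word in product((0, 1), repeat=length):
        p = word_probability(word)
        if p > 0:
            entropy -= p * log2(p)
    return entropy


# Discrete derivative of the block entropy: H(L) - H(L-1) for L = 1..5.
entropy_rate_estimates = [block_entropy(L) - block_entropy(L - 1) for L in range(1, 6)]
```

Because each (state, symbol) pair has a single successor, following one path per start state is enough here; a non-unifilar model would need a forward-algorithm sum over hidden paths instead.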

arXiv.org e-Print Archive

Crossref

eScholarship - University of California

Actor-Critic Policy Learning in Cooperative Planning

Author: Bhatnagar S.
Howard R. A.
Murphey R.
Puterman M. L.
Russell S.
Sutton R.S.
Publication venue: 'American Institute of Aeronautics and Astronautics (AIAA)'
Publication date: 01/08/2010

In this paper, we introduce a method for learning and adapting cooperative control strategies in real-time stochastic domains. Our framework is an instance of the intelligent cooperative control architecture (iCCA). The agent starts by following the "safe" plan calculated by the planning module and incrementally adapts its policy to maximize the cumulative rewards. Actor-critic and consensus-based bundle algorithm (CBBA) were employed as the building blocks of the iCCA framework. We demonstrate the performance of our approach by simulating limited-fuel unmanned aerial vehicles aiming for stochastic targets. In one experiment where the optimal solution can be calculated, the integrated framework boosted the optimality of the solution by an average of 10% when compared to running each of the modules individually, while keeping the computational load within the requirements for real-time implementation. Funding: Boeing Scientific Research Laboratories; United States Air Force Office of Scientific Research (Grant FA9550-08-1-0086)

DSpace@MIT

Crossref

Chronic psychosocial and financial burden accelerates 5-year telomere shortening: findings from the Coronary Artery Risk Development in Young Adults Study.

Author: AJ Schuit
AK Damjanovic
Aric A. Prather
AT Geronimus
B Mezuk
Barbara Sternfeld
C Duggan
C Schaefer
CG Parks
CK Enders
CM Aldwin
DS Lauderdale
E Puterman
E Sahin
EH Blackburn
Eli Puterman
Elissa S. Epel
ES Epel
GD Friedman
GE Miller
GL Schlomer
H Ma
IM Wentzensen
J Campisi
J Deelen
J Humphreys
J Lin
J Svensson
J Zhao
JE Verhoeven
JJW Liu
JP Gouin
JR Piazza
Jue Lin
K Ahola
K Litzelman
L Ala-Mursula
L Bendix
L Rode
LI Pearlin
M Booth
M Hamer
M Jaskelioff
M Kimura
ME Glickman
MF Scheier
MH Schafer
Nancy Adler
OT Njajou
PC Haycock
PG Surtees
RM Cawthon
S Cohen
S Jodczyk
S Richardson
S Yusuf
SE Taylor
SL Bakaysa
ST Charles
T Steenstrup
TC Adam
TE Seeman
Tomás Cabeza de Baca
U Svenson
V Codd
W Chen
W Poortinga
Y Zhan
Publication venue: eScholarship, University of California
Publication date: 01/05/2020

Leukocyte telomere length, a marker of immune system function, is sensitive to exposures such as psychosocial stressors and health-maintaining behaviors. Past research has determined that stress experienced in adulthood is associated with shorter telomere length, but is limited to mostly cross-sectional reports. We test whether repeated reports of chronic psychosocial and financial burden are associated with telomere length change over a 5-year period (years 15 and 20) among 969 participants in the Coronary Artery Risk Development in Young Adults (CARDIA) Study, a longitudinal, population-based cohort, ages 18-30 at time of recruitment in 1985. We further examine whether multisystem resiliency, comprising social connections, health-maintaining behaviors, and psychological resources, mitigates the effects of repeated burden on telomere attrition over 5 years. Our results indicate that adults with high chronic burden do not show decreased telomere length over the 5-year period. However, these effects do vary by level of resiliency, as regression results revealed a significant interaction between chronic burden and multisystem resiliency. For individuals with high repeated chronic burden and low multisystem resiliency (1 SD below the mean), there was a significant 5-year shortening in telomere length, whereas no significant relationships between chronic burden and attrition were evident for those at moderate and higher levels of resiliency. These effects apply similarly across the three components of resiliency. Results imply that interventions should focus on establishing strong social connections, psychological resources, and health-maintaining behaviors when attempting to ameliorate stress-related decline in telomere length among at-risk individuals.

Crossref

eScholarship - University of California

Measurement-Adaptive Cellular Random Access Protocols

Author: A. Ephremides
Anastasios Giovanidis
B. Hajek
G. Bianchi
G. del Angel
H. Takagi
L. Kleinrock
M. H. Cheung
M. L. Puterman
P. Gupta
Qi Liao
R. R. Boorstyn
S. Asmussen
S. S. Lam
Sławomir Stańczak
Y. Al Harthi
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2014

This work considers a single-cell random access channel (RACH) in cellular wireless networks. Communications over RACH take place when users try to connect to a base station during a handover or when establishing a new connection. Within the framework of Self-Organizing Networks (SONs), the system should self-adapt to dynamically changing environments (channel fading, mobility, etc.) without human intervention. To improve the performance of the RACH procedure, we aim here at maximizing throughput or, alternatively, minimizing the user dropping rate. In the context of SON, we propose protocols which exploit information from measurements and user reports in order to estimate current values of the system unknowns and broadcast global action-related values to all users. The protocols suggest an optimal pair of user actions (transmission power and back-off probability) found by minimizing the drift of a certain function. Numerical results illustrate considerable benefits in the dropping rate, at a very low or even zero cost in power expenditure and delay, as well as the fast adaptability of the protocols to environment changes. Although the proposed protocol is designed primarily to minimize the number of discarded users per cell, our framework allows for other variations (power or delay minimization) as well. Comment: 31 pages, 13 figures, 3 tables. Springer Wireless Networks 201

arXiv.org e-Print Archive

Crossref

INRIA a CCSD electronic archive server

Fraunhofer-ePrints

The Complexity of Graph-Based Reductions for Reachability in Markov Decision Processes

Author: AL Strehl
C Baier
C Courcoubetis
C Dehnert
Krishnendu Chatterjee
L Valiant
LP Kaelbling
M Kwiatkowska
M Steinmetz
ML Puterman
N Fijalkow
PR D’Argenio
S Fortune
SJ Russell
T Brázdil
T Eilam-Tzoreff
Publication date: 01/01/2018

We study the never-worse relation (NWR) for Markov decision processes with an infinite-horizon reachability objective. A state q is never worse than a state p if the maximal probability of reaching the target set of states from p is at most the same value from q, regardless of the probabilities labelling the transitions. Extremal-probability states, end components, and essential states are all special cases of the equivalence relation induced by the NWR. Using the NWR, states in the same equivalence class can be collapsed. Then, actions leading to suboptimal states can be removed. We show the natural decision problem associated to computing the NWR is coNP-complete. Finally, we extend a previously known incomplete polynomial-time iterative algorithm to under-approximate the NWR.
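The quantity compared by the NWR, the maximal probability of reaching a target, can be sketched with plain value iteration. The four-state MDP, its probabilities, and the function name below are hypothetical and chosen only for illustration. Note that the sketch evaluates a single fixed assignment of probabilities, whereas the NWR quantifies over all assignments on the same graph; it is not the paper's algorithm. In this example q has an action that moves to p with probability 1, so q can always do at least as well as p whatever the other probabilities are.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical encoding: state -> action -> list of (probability, successor).
MDP = Dict[str, Dict[str, List[Tuple[float, str]]]]


def max_reach_prob(mdp: MDP, targets: Set[str],
                   max_sweeps: int = 1000, tol: float = 1e-12) -> Dict[str, float]:
    """Value iteration for the maximal probability of reaching `targets`."""
    value = {s: (1.0 if s in targets else 0.0) for s in mdp}
    for _ in range(max_sweeps):
        delta = 0.0
        for s, actions in mdp.items():
            if s in targets or not actions:
                continue  # targets stay at 1, states without actions stay at 0
            best = max(sum(p * value[t] for p, t in succ)
                       for succ in actions.values())
            delta = max(delta, abs(best - value[s]))
            value[s] = best
        if delta < tol:
            break
    return value


# Illustrative MDP: q may either jump to p or gamble directly.
example: MDP = {
    "p": {"a": [(0.5, "goal"), (0.5, "sink")]},
    "q": {"to_p": [(1.0, "p")], "b": [(0.8, "goal"), (0.2, "sink")]},
    "goal": {},
    "sink": {},
}
values = max_reach_prob(example, {"goal"})
```

With these numbers the iteration gives 0.5 for p and 0.8 for q; lowering the probability on action b below 0.5 would make the two values equal, but never put q below p.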

arXiv.org e-Print Archive

Crossref

Institutional Repository Universiteit Antwerpen

DI-fusion

Mean-Payoff Optimization in Continuous-Time Markov Chains with Parametric Alarms

Author: A Jovanovic
A Jovanović
C Haase
C Lindemann
DLP Minh
DP Bertsekas
EG Amparore
EM Hahn
H Choi
JR Norris
L Alfaro
L-M Traonouez
M Češka
ML Puterman
PJ Haas
R German
SK Jha
T Brázdil
W Nelson
Publication date: 20/06/2017

Continuous-time Markov chains with alarms (ACTMCs) allow for alarm events that can be non-exponentially distributed. Within parametric ACTMCs, the parameters of alarm-event distributions are not given explicitly and can be subject to parameter synthesis. An algorithm solving the ε-optimal parameter synthesis problem for parametric ACTMCs with long-run average optimization objectives is presented. Our approach is based on reduction of the problem to finding long-run average optimal strategies in semi-Markov decision processes (semi-MDPs) and sufficient discretization of the parameter (i.e., action) space. Since the set of actions in the discretized semi-MDP can be very large, a straightforward approach based on explicit action-space construction fails to solve even simple instances of the problem. The presented algorithm uses an enhanced policy iteration on symbolic representations of the action space. The soundness of the algorithm is established for parametric ACTMCs with alarm-event distributions satisfying four mild assumptions that are shown to hold for uniform, Dirac and Weibull distributions in particular, but are satisfied for many other distributions as well. An experimental implementation shows that the symbolic technique substantially improves the efficiency of the synthesis algorithm and makes it possible to solve instances of realistic size. Comment: This article is a full version of a paper accepted to the Conference on Quantitative Evaluation of SysTems (QEST) 201

arXiv.org e-Print Archive

Crossref

Maximizing the Conditional Expected Reward for Reaching the Goal

Author: C Acerbi
C Baier
DP Bertsekas
F Gretz
G Barthe
G Seber
J-P Katoen
K Chatterjee
K Chatzikokolakis
L Alfaro
L Kallenberg
M Kwiatkowska
M Randour
ME Andrés
ML Puterman
MS Alvim
T Brázdil
Publication date: 19/01/2017

The paper addresses the problem of computing maximal conditional expected accumulated rewards until reaching a target state (briefly called maximal conditional expectations) in finite-state Markov decision processes where the condition is given as a reachability constraint. Conditional expectations of this type can, e.g., stand for the maximal expected termination time of probabilistic programs with non-determinism, under the condition that the program eventually terminates, or for the worst-case expected penalty to be paid, assuming that at least three deadlines are missed. The main results of the paper are (i) a polynomial-time algorithm to check the finiteness of maximal conditional expectations, (ii) PSPACE-completeness for the threshold problem in acyclic Markov decision processes where the task is to check whether the maximal conditional expectation exceeds a given threshold, (iii) a pseudo-polynomial-time algorithm for the threshold problem in the general (cyclic) case, and (iv) an exponential-time algorithm for computing the maximal conditional expectation and an optimal scheduler. Comment: 103 pages, extended version with appendices of a paper accepted at TACAS 201
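To make the notion of a conditional expectation concrete, here is a minimal sketch for the much simpler case of a Markov chain, that is, an MDP with the scheduler already fixed. It does not attempt the maximization over schedulers that the paper studies. The chain, the reward function, and all names are hypothetical. The sketch uses the identity E[reward | reach goal] = E[reward · 1{reach goal}] / Pr(reach goal), computing both parts by fixed-point iteration.

```python
def conditional_expected_reward(chain, reward, goal, start, sweeps=500):
    """Expected reward accumulated before `goal`, conditioned on reaching it.

    chain[s] maps successors to probabilities; states with an empty map are
    absorbing. reward[s] is collected on every visit to a non-goal state s.
    """
    # Pr(eventually reach goal) from each state.
    reach = {s: (1.0 if s == goal else 0.0) for s in chain}
    for _ in range(sweeps):
        for s, succ in chain.items():
            if s != goal and succ:
                reach[s] = sum(p * reach[t] for t, p in succ.items())

    # E[accumulated reward * indicator(goal is reached)] from each state.
    weighted = {s: 0.0 for s in chain}
    for _ in range(sweeps):
        for s, succ in chain.items():
            if s != goal and succ:
                weighted[s] = reward[s] * reach[s] + sum(
                    p * weighted[t] for t, p in succ.items())

    if reach[start] == 0.0:
        raise ValueError("goal is unreachable from start; condition has probability 0")
    return weighted[start] / reach[start]


# Illustrative chain: s0 loops with probability 1/2, otherwise succeeds or fails.
chain = {
    "s0": {"s0": 0.5, "goal": 0.25, "fail": 0.25},
    "goal": {},
    "fail": {},
}
reward = {"s0": 1.0, "goal": 0.0, "fail": 0.0}
result = conditional_expected_reward(chain, reward, "goal", "s0")
```

In this chain the goal is reached with probability 1/2 and the run visits s0 twice on average before leaving, so the conditional expectation comes out at 2.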

arXiv.org e-Print Archive

Crossref

The Impatient May Use Limited Optimism to Minimize Regret

Author: B Aminof
C Reutenauer
CJCH Watkins
E Allender
E Filiot
F Cucker
J Filar
JY Halpern
KR Apt
L Alfaro de
LS Shapley
M Jurdzinski
ML Puterman
P Hunter
R Brenguier
U Zwick
Publication date: 17/11/2018

Discounted-sum games provide a formal model for the study of reinforcement learning, where the agent is enticed to get rewards early since later rewards are discounted. When the agent interacts with the environment, she may regret her actions, realizing that a previous choice was suboptimal given the behavior of the environment. The main contribution of this paper is a PSPACE algorithm for computing the minimum possible regret of a given game. To this end, several results of independent interest are shown. (1) We identify a class of regret-minimizing and admissible strategies that first assume that the environment is collaborating, then assume it is adversarial; the precise timing of the switch is key here. (2) Disregarding the computational cost of numerical analysis, we provide an NP algorithm that checks that the regret entailed by a given time-switching strategy exceeds a given value. (3) We show that determining whether a strategy minimizes regret is decidable in PSPACE.
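The notion of regret in this abstract can be illustrated on a deliberately tiny example. The sketch below flattens the game into a table in which the agent and the environment each commit to one of two choices and the resulting play yields a finite reward sequence (followed by zeros). The table, the discount factor, and all names are hypothetical, and this is not the paper's PSPACE algorithm. Regret of an agent choice is taken as the worst case, over environment choices, of the gap between the best discounted sum achievable against that environment choice and the sum actually obtained.

```python
def discounted_sum(rewards, discount):
    """Sum of rewards[i] * discount**i over a finite reward sequence."""
    return sum(r * discount ** i for i, r in enumerate(rewards))


# Hypothetical plays: (agent choice, environment choice) -> reward sequence.
PLAYS = {
    ("a", "x"): [4],
    ("a", "y"): [0, 2],
    ("b", "x"): [0, 4],
    ("b", "y"): [2],
}
DISCOUNT = 0.5
AGENT_CHOICES = ("a", "b")
ENV_CHOICES = ("x", "y")

value = {key: discounted_sum(seq, DISCOUNT) for key, seq in PLAYS.items()}

# Regret of each agent choice: worst-case gap to the best response in hindsight.
regret = {
    agent: max(
        max(value[(other, env)] for other in AGENT_CHOICES) - value[(agent, env)]
        for env in ENV_CHOICES
    )
    for agent in AGENT_CHOICES
}
best_choice = min(regret, key=regret.get)
```

With a discount factor of 0.5 the discounted sums are 4, 1, 2 and 2, so choice a has regret 1, choice b has regret 2, and a minimizes regret. Delaying a reward by one step halves its value, which is what pushes the agent toward early rewards.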

arXiv.org e-Print Archive

Crossref

Institutional Repository Universiteit Antwerpen

DI-fusion